Module 4 Lecture - One-way ANOVA and Multiple Comparison Procedures

Analysis of Variance

Quinton Quagliano, M.S., C.S.P

Department of Educational Psychology

1 Overview and Introduction

Agenda

1 Overview and Introduction

2 The F Distribution

3 The F Ratio

4 One-way ANOVA

5 Mean Comparisons in One-way ANOVA

6 Conclusion

1.1 Objectives

  • Interpret the F probability distribution as the number of groups and the sample size change.
  • Discuss one use for the F distribution: one-way ANOVA
  • Conduct and interpret one-way ANOVA.
  • Appreciate the ANOVA as another test in your toolbox alongside the other tests for dealing with categorical and continuous outcome variables
  • Understand the F-distribution as another practical distribution for deriving inferences and conclusions
  • Understand the types of effect size usually employed for ANOVA
  • Introduce the different mean comparison strategies in a One-way ANOVA, to be followed up on in the next lecture

1.2 Introduction

  • Discuss: When we needed to compare the averages of two groups on a continuous variable, what statistical test would we use?
  • While we’ve already covered how to test differences between two groups (even with an assumption violation!), now we need to consider the scenario with three or more groups on some outcome
    • Examples:
      • Comparing class outcomes for those in hybrid, in-person, or online sections
      • Comparing students on three different tracks in high school: regular, advanced, or remedial
  • When we are comparing three categorical groups on some numeric, continuous outcome, we will employ the one-way ANOVA, standing for Analysis of Variance (ANOVA)
    • ANOVA is a broad family of techniques with many applications and extensions - many of which we will cover in this class!
    • We are looking at the one-way ANOVA as a specific use of this technique
  • Important: 'ANOVA is a t-test, but for 3 or more categories' is actually a pretty fair way to describe this, as the two tests are equivalent if/when comparing only two groups.
  • Why not use a bunch of t-tests across the 3 groups instead of this new, more complicated method?
    • Problem: Running multiple t-tests inflates our Type I error rate
    • Instead, pairwise comparisons are handled within the ANOVA framework as part of post-hoc or a-priori testing
  • Discuss: Describe what Type I errors are, in your own words
  • Previous tests all made use of specific, practical distributions for determining significance:
    • z-tests \(\rightarrow\) normal distribution
    • t-tests \(\rightarrow\) t-distribution
    • \(\chi^2\) tests \(\rightarrow\) \(\chi^2\) distribution
    • The ANOVA family will introduce a new distribution: the F-distribution
  • We’ll start by describing the F-distribution itself, then the F-ratio/statistic that we compute, and then finally the application within the One-way ANOVA

2 The F Distribution

2.1 Introduction

  • The F-distribution was developed by Sir Ronald Fisher and is a unique distribution applied often for the ANOVA family of statistical tests
    • In notation it is given as \(F \sim F_{df(num),df(denom)}\) where:
    • \(df(num) \rightarrow df_{between}\) and \(df(denom) \rightarrow df_{within}\)
    • Example: \(F \sim F_{2, 24}\) has:
      • \(df_{between} = 2\)
      • \(df_{within} = 24\)
  • Important: Hold on just a second, *two* different degrees of freedom? Yes! We'll discuss more about why that is in our discussion on the F-ratio itself
  • The F-distribution is derived from the t-distribution: with one numerator degree of freedom, it consists of values that are squares of values from the t-distribution
    • How exactly Fisher proved that is well beyond the scope of this class
  • Discuss: Recall, what exactly is the distinction that makes the t-distribution different and useful over the plain normal distribution? The F-statistic benefits from this the same!
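We can see the t-to-F relationship for ourselves. The sketch below (assuming `scipy` and `numpy` are installed; the data are randomly generated for illustration) runs an equal-variance t-test and a one-way ANOVA on the same two groups: the squared t statistic matches the F statistic, and the p-values agree.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
g1 = rng.normal(10, 2, size=15)  # two made-up samples
g2 = rng.normal(12, 2, size=15)

t_stat, t_p = stats.ttest_ind(g1, g2)  # equal-variance independent t-test
f_stat, f_p = stats.f_oneway(g1, g2)   # one-way ANOVA with k = 2 groups

# Squared t equals F, and the two p-values are identical
print(t_stat**2, f_stat)
print(t_p, f_p)
```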

2.2 Additional Facts About the F Distribution

  • The curve is not symmetrical but skewed to the right.
    • Thus, it has a more similar look to the chi-squared distribution, compared to the t-distribution
  • There is a different curve for each set of dfs.
    • Once again, similar to chi-squared
  • The F statistic is always greater than or equal to zero.
  • Discuss: After you work through the F-ratio section and calculation below, return to this section and try to explain, using the formula, why this F is always greater than or equal to zero
  • As the degrees of freedom for the numerator and for the denominator get larger, the curve approximates the normal distribution.
    • Remember, the \(df_{between} \rightarrow\) numerator
    • The \(df_{within} \rightarrow\) denominator
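These facts are easy to check numerically. A quick sketch with `scipy.stats.f` (df values chosen arbitrarily for illustration): the right-tail critical value shrinks as the denominator df grows, and the distribution has no mass below zero.

```python
from scipy import stats

# Right-tail critical values (alpha = .05) for increasing denominator df
for df_within in (5, 24, 120):
    crit = stats.f.ppf(0.95, dfn=2, dfd=df_within)
    print(f"F(2, {df_within}) critical value: {crit:.3f}")

# F is never negative: its support starts at zero
print(stats.f.cdf(0, dfn=2, dfd=24))  # prints 0.0
```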

3 The F Ratio

3.1 Introduction

  • The F-ratio is just another name for the F statistic that results from the test we run
    • The F-ratio/statistic we get is a ratio of:
      • The variances between samples
      • The variances within samples
      • Its formula is given as:

\[ F = \frac{MS_{between}}{MS_{within}} \]

  • We’ll describe what the \(MS\) means here in a second

3.2 Between & Within Variation

  • For Variances between samples, we are considering how much difference there is between the 3 or more groups we are comparing to one another
    • This may also be known as “variation due to treatment” or “explained variation”
    • This will be represented computationally as \(MS_{between}\), the mean square between groups
    • This is the numerator of the F-ratio formula
  • In the case of Variances within samples, we examine how much variation there is within each of the groups we are comparing to one another
    • This may also be known as “variation due to error” or “unexplained variation”
    • This will be represented as \(MS_{within}\) in the following formula
    • This is the denominator of the F-ratio formula

3.3 Calculation of the F-ratio/statistic

  • We are not going to calculate F by hand, as it is very time-consuming, albeit plenty possible
    • However, you should still follow along and make sure you can see the flow of information as we substitute numbers in these calculations

Overall F Calculation

\[ F = \frac{MS_{between}}{MS_{within}} \]

Between-group Calculations

\[ MS_{between} = \frac{SS_{between}}{df_{between}} \]

\[ df_{between} = k - 1 \]

\[ SS_{between} = \sum{[\frac{(s_j)^2}{n_j}] - \frac{(\sum{s_j})^2}{n}} \]

\[ SS_{total} = \sum{x^2} - \frac{(\sum{x})^2}{n} \]

  • Where
    • \(SS_{between}:\) Sum of squares between groups
    • \(SS_{total}:\) Sum of squares total
    • \(df_{between}:\) degrees of freedom between groups
    • \(k:\) the number of groups
    • \(n:\) total sample size
    • \(n_j:\) the size of the \(j^{th}\) group
    • \(s_j:\) the sum of values in the \(j^{th}\) group

Within-group Calculations

\[ MS_{within} = \frac{SS_{within}}{df_{within}} \]

\[ df_{within} = n - k \]

\[ SS_{within} = SS_{total} - SS_{between} \]

  • Where
    • \(SS_{within}:\) Sum of squares within groups
    • \(SS_{between}:\) Sum of squares between groups
    • \(SS_{total}:\) Sum of squares total
    • \(df_{within}:\) degrees of freedom within groups
    • \(n:\) total sample size
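To see the flow of information through these formulas, here is a sketch that substitutes made-up numbers into the computational formulas above, then checks the result against `scipy.stats.f_oneway` (assuming `numpy` and `scipy` are available):

```python
import numpy as np
from scipy import stats

# Three hypothetical groups (values invented for illustration)
groups = [np.array([4., 5., 6., 5.]),
          np.array([7., 8., 6., 7.]),
          np.array([10., 9., 11., 10.])]

k = len(groups)                     # number of groups
n = sum(len(g) for g in groups)     # total sample size
all_x = np.concatenate(groups)

# Computational formulas from the slides (s_j = group sum)
ss_total = np.sum(all_x**2) - all_x.sum()**2 / n
ss_between = sum(g.sum()**2 / len(g) for g in groups) - all_x.sum()**2 / n
ss_within = ss_total - ss_between

ms_between = ss_between / (k - 1)   # df_between = k - 1
ms_within = ss_within / (n - k)     # df_within = n - k
f_ratio = ms_between / ms_within

# scipy's built-in one-way ANOVA should agree
f_check, p_value = stats.f_oneway(*groups)
print(f_ratio, f_check)
```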

3.4 Conclusions with the F-ratio/statistic

  • Discuss: Review: describe the general processes of hypothesis testing, starting with setting the null and alternative hypotheses and ending with a rejection or retaining of the null hypothesis. Try and use the words 'rare event' at some point in your explanation.
  • The F statistic, which we derive from the formula for whatever test we are using, is actually best described as a ratio or fraction (we’ll return to this ratio/statistic in a bit!)
    • Much like previous test statistics, we use this to determine if our results are rare enough to describe our results as unlikely if the null hypothesis is true
    • As usual, the F statistic has a corresponding p-value that tells us the probability of obtaining results at least this extreme if the null hypothesis is true
  • Question: What value do we compare p against to determine if we can reject the null hypothesis?
    • A) Alpha
    • B) Omega
    • C) Confidence Level
    • D) Beta
  • For significance testing of the F-ratio, our one-way ANOVA test will always be right-tailed due to the ratio nature of the statistic, with larger values suggesting greater variation between groups relative to the variation within groups!
    • Put another way, we are trying to see if it is far enough out on the right tail to say that it is significant!
  • Discuss: Which other distribution also always had a right-tailed application? Explain why that is using the relevant formula for that statistic.
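The right-tailed logic can be sketched in one line: the p-value is the area to the right of the observed F. The F value and degrees of freedom below are hypothetical, chosen just to show the mechanics (assuming `scipy` is available):

```python
from scipy import stats

f_ratio = 5.39                 # hypothetical observed F from a one-way ANOVA
df_between, df_within = 2, 24

# Right-tailed test: p is the area to the RIGHT of the observed F
# (sf is the survival function, i.e. 1 - cdf)
p_value = stats.f.sf(f_ratio, df_between, df_within)

alpha = 0.05
print(f"p = {p_value:.4f}; reject H0: {p_value < alpha}")
```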

4 One-way ANOVA

4.1 Introduction

  • Important: In the prior calculations, we covered the mathematical computation in a one-way ANOVA, but this next part will cover the more conceptual side of things.
  • The goal of the one-way ANOVA is to determine if there are significant differences between multiple group means
    • This is done by examining the variances of the groups (as we previously discussed as part of The F Ratio)
  • This is how we will practically use the F-distribution

4.2 Assumptions

  • Much like the other tests, we need to be mindful of several assumptions that underlie this statistical test

  • These assumptions are:

    • Each population from which a sample for each group is taken is assumed to be normal.
    • All samples are randomly selected and independent.
    • The populations are assumed to have equal standard deviations (or variances).
    • The factor is a categorical variable.
    • The response is a numerical variable.

4.3 Null and Alternative Hypotheses

  • In a one-way ANOVA, the null hypothesis is as follows:
    • \(H_0: \mu_1 = \mu_2 = \mu_3 ... = \mu_k\)
    • \(H_A:\) At least two of the group means \(\mu_1,\mu_2,\mu_3,..., \mu_k\) are not equal
  • Important: Be *very* careful in how you phrase the alternative hypothesis for this test, it looks a little bit different than the other tests we've covered before!
  • Effectively, a statistically significant F-statistic only tells us that a difference exists somewhere, but not where that difference lies
    • More explanation on this in the Extensions section!
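A quick sketch makes the omnibus point concrete. In the made-up data below, only the third group's mean is shifted; `scipy.stats.f_oneway` rejects the null, but the F test alone cannot say which pair of means drove the rejection.

```python
from scipy import stats

# Groups 1 and 2 share a mean; group 3 is shifted upward
g1 = [4, 5, 6, 5, 4]
g2 = [5, 4, 6, 5, 5]
g3 = [9, 8, 10, 9, 8]

f_stat, p = stats.f_oneway(g1, g2, g3)
print(f"F = {f_stat:.2f}, p = {p:.4f}")
# A small p only says SOME means differ -- not which ones
```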
  • Discuss: What is the name of this type of plot?

4.4 Extensions

  • Of course, this is only the tip of the iceberg

  • One question that the ANOVA alone doesn't answer: which of the groups are different?

    • If we have a significant one-way ANOVA test, then we can follow up with post-hoc testing, which can tell us which groups exactly are different than one another
  • There is also the two-way ANOVA, which allows us to compare two or more independent variables and their combined and interacting impact on a single dependent variable.

    • While beyond the scope of this class, interactions are an incredibly important part of a lot of more complicated studies
  • Finally, there is the repeated-measures ANOVA, which compares measurements taken on the same sample across multiple time points or conditions

  • Discuss: Which of the t-tests also dealt with comparing two sets of measurements, taken at two separate time points, on the same sample?
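As a preview of post-hoc testing, here's a sketch that follows a significant omnibus ANOVA with Tukey's HSD (available as `scipy.stats.tukey_hsd` in SciPy 1.8+; the data are invented for illustration):

```python
from scipy import stats

# Made-up data: groups 1 and 2 are similar, group 3 is shifted
g1 = [4, 5, 6, 5, 4]
g2 = [5, 4, 6, 5, 5]
g3 = [9, 8, 10, 9, 8]

# Omnibus test first: is there a difference somewhere?
f_stat, p = stats.f_oneway(g1, g2, g3)
print(f"ANOVA p = {p:.4f}")

# Post-hoc: Tukey's HSD pinpoints WHICH pairs differ
res = stats.tukey_hsd(g1, g2, g3)
print(res.pvalue)  # matrix of pairwise p-values
```

Here the pairwise p-values show the group 1 vs. group 3 and group 2 vs. group 3 comparisons are significant, while group 1 vs. group 2 is not.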

4.5 Effect Size for One-way ANOVA

  • There are two measures of effect size: Eta-squared and Omega-squared

Eta-squared

  • The first effect size estimate is eta-squared, or \(\eta^2\)
    • A measure of effect size for one-way ANOVA to describe proportion of variance in the dependent variable explained by the independent variable in this sample

\[ \eta^2 = \frac{SS_{Between}}{SS_{Total}} \]

  • Discuss: What other effect size coefficient is also described as 'variance explained in one variable by another'? Hint: think of comparisons between two continuous numeric variables

Omega-squared

  • The second effect size estimate is omega-squared or \(\omega^2\)
    • A measure of effect size for one-way ANOVA to describe proportion of variance in the dependent variable explained by independent variable in the target population; unbiased effect size estimate

\[ \omega^2 = \frac{SS_{Between} - (k-1)(MS_{within})}{SS_{total} + MS_{within}} \]

  • Important: Because omega-squared tries to speak to the *population*, I generally prefer it as an effect size indicator over eta-squared
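Both effect sizes can be computed directly from the ANOVA table quantities. The sum-of-squares values below are hypothetical, chosen just to illustrate the formulas side by side:

```python
# Hypothetical ANOVA table quantities for a 3-group design, n = 12 total
ss_between, ss_total = 98.13, 137.33
k, n = 3, 12

ss_within = ss_total - ss_between
ms_within = ss_within / (n - k)

# Eta-squared: sample-based proportion of variance explained
eta_sq = ss_between / ss_total

# Omega-squared: less biased estimate aimed at the population
omega_sq = (ss_between - (k - 1) * ms_within) / (ss_total + ms_within)

print(f"eta^2 = {eta_sq:.3f}, omega^2 = {omega_sq:.3f}")
```

Note that omega-squared comes out smaller: it corrects the sample-based eta-squared downward toward the population value.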

5 Mean Comparisons in One-way ANOVA

5.1 Introduction

  • One of the core weaknesses of an ANOVA test is that it doesn’t provide any information by itself on which groups are different from one another
    • To do these comparisons, we use contrast/comparison tests
    • Using these, we can test specific hypotheses about differences between our 3 or more groups
    • However, you should only pursue doing mean comparisons if your overall one-way ANOVA is statistically significant.
  • Important: Think about what our null hypothesis is under ANOVA. Very limited, right? Contrasts and comparisons are how we can move past that

Types of Contrasts

  • We have two types of contrasts:
    • A Priori / Planned / Orthogonal Contrasts - Planned ahead of time
    • A Posteriori / Post-hoc - Performed after ANOVA
  • Both of these types of contrasts tests make the same assumptions as the ANOVA itself
  • Important: Planned comparisons are more powerful (less stringent) than post-hoc comparisons, which are typically more conservative. For this reason, I'd usually say it's best to make predictions prior to analyzing the data.

Concept of Weights

  • For contrast tests, weights are assigned to means (weighted means) to reflect the hypothesis.
    • The means assumed to have equal values are assigned the same weights
  • The weights are somewhat arbitrary, but they have to satisfy certain conditions:
    • The weights used in a comparison must sum to zero - put another way, they are all relative to one another.
    • Every mean has to have a weight even if it is zero
    • Larger weights represent larger hypothesized means, and means that are hypothesized to be equal should be assigned the same weights
  • Example: All of the following sets of weights express the same hypothesis, where groups 2 and 3 are equal and group 1 is greater than each of them
    G1    G2    G3
     2   -1.0  -1.0
     1   -0.5  -0.5
     4   -2.0  -2.0
  • Important: With all of this said, contrast testing, just like normal inferential testing, should never be careless!
  • Discuss: Write a set of weights that would satisfy the conditions that G1 is greater than G2, which is greater than G3.
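The weight conditions are easy to verify numerically. A minimal sketch, with hypothetical group means chosen for illustration: the weights must sum to zero, and the contrast estimate \(\psi = \sum w_j \bar{x}_j\) is positive when the data support the hypothesis.

```python
import numpy as np

# Hypothetical group means; hypothesis: G1 greater than G2 = G3
means = np.array([12.0, 8.0, 8.0])
weights = np.array([2.0, -1.0, -1.0])

# Condition: valid contrast weights must sum to zero
print(weights.sum())  # 0.0

# The contrast estimate: a positive value supports the hypothesis
psi = np.dot(weights, means)
print(psi)  # 8.0
```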

6 Conclusion

6.1 Recap

  • We can think of a one-way ANOVA as being similar to an independent-samples t-test, but for more than 2 groups. Just note the differences in the null/alternative hypotheses setup

  • There are some important extensions and other applications of the ANOVA that we did not comprehensively cover here. However, they do still employ the F-distribution and have the underlying focus on comparing variances to make conclusions about differences between groups. We will cover those later on!

  • The F-distribution adds to the other practical distributions employed as part of hypothesis testing, such as the t-distribution and chi-squared distribution (which you learned about in EDPS-641). Just like those, it allows us to conclude whether a null hypothesis can be rejected or retained.

  • On top of performing a One-way ANOVA, we can also follow it up with various Mean Comparisons in One-way ANOVA! These help us determine where our differences exists between our groups, and can offer important insights on top of our one-way ANOVA results. We continue working through these in next lecture.

6.2 Lecture Check-in

  • Make sure to complete any lecture check-in tasks associated with this lecture!
